Clustering Chinese Product Features with Multilevel Similarity

Authors

  • Yu He
  • Jiaying Song
  • Yuzhuang Nan
  • Guohong Fu
Abstract

This paper presents an unsupervised hierarchical clustering approach for grouping co-referred features in Chinese product reviews. To handle the different levels of connection between co-referred product features, we consider three similarity measures: literal similarity, word embedding-based semantic similarity, and explanatory evaluation-based contextual similarity. We apply our approach to two corpora of product reviews in the car and mobile phone domains. We demonstrate that combining multilevel similarity is of great value to feature normalization.
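
The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a character-overlap (Dice) score for literal similarity, cosine similarity over toy embedding vectors for semantic similarity, cosine similarity over hypothetical opinion-word count vectors for contextual similarity, equal weights for combining the three, and average-link agglomerative clustering with a stopping threshold. All function names, vectors, weights, and the threshold are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt

def literal_similarity(a, b):
    """Dice coefficient over the character sets of two feature strings (assumed measure)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))

def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def combined_similarity(a, b, embeddings, contexts, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of literal, semantic, and contextual similarity (weights are assumed)."""
    w_lit, w_sem, w_ctx = weights
    return (w_lit * literal_similarity(a, b)
            + w_sem * cosine(embeddings[a], embeddings[b])
            + w_ctx * cosine(contexts[a], contexts[b]))

def cluster_features(features, similarity, threshold):
    """Average-link agglomerative clustering; stops when the best merge scores below threshold."""
    clusters = [[f] for f in features]
    while len(clusters) > 1:
        best_score, best_pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
            score = sum(similarity(a, b) for a, b in pairs) / len(pairs)
            if score > best_score:
                best_score, best_pair = score, (i, j)
        if best_score < threshold:
            break
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Toy data, invented for illustration only (not taken from the paper's corpora).
FEATURES = ["油耗", "耗油量", "外观", "外形"]
EMBEDDINGS = {  # hypothetical 2-dimensional word vectors
    "油耗": [1.0, 0.1], "耗油量": [0.9, 0.2],
    "外观": [0.1, 1.0], "外形": [0.2, 0.9],
}
CONTEXTS = {  # hypothetical counts of co-occurring opinion words ("高", "低", "漂亮")
    "油耗": [3, 2, 0], "耗油量": [2, 2, 0],
    "外观": [0, 0, 4], "外形": [0, 0, 3],
}

def toy_similarity(a, b):
    return combined_similarity(a, b, EMBEDDINGS, CONTEXTS)

clusters = cluster_features(FEATURES, toy_similarity, threshold=0.5)
print(clusters)
```

With this toy data, the two fuel-consumption terms and the two appearance terms end up in separate groups. The threshold controls how aggressively groups are merged, and a real system would need to tune it, together with the weights, on held-out data.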


Similar resources

Extract Product Features in Chinese Web for Opinion Mining

In sentiment analysis of product reviews, one important problem is to extract people's opinions based on product features. Through the summary of feature-level opinions, different consumers can choose their favorite products according to the features that they care about. At the same time, manufacturers can also improve the product features based on the opinions. Different words may be used to ...


Fast Multilevel Clustering

Clustering is a difficult problem. The data to be clustered may differ in a variety of aspects (dimensionality, cluster size, noise, etc.), and the criterion for clustering may depend on the context in which the data is given. We present a multilevel approach for clustering, easily adaptable to handle various kinds of data by identifying desired underlying features of the data. The scheme we present is g...


Suffix Tree Based Chinese Document Feature Extraction and Clustering in RSS Aggregator

In an RSS aggregator, an important issue is how to make feed information more manageable for RSS subscribers. In this paper, we propose suffix tree based clustering of RSS feed documents in a Chinese RSS aggregator. We construct a suffix tree with meaningful Chinese words, and choose the phrases with a high score given by a formula as document features. We cluster documents using group-average algo...


Offline Recognition of Chinese Handwriting by Multifeature and Multilevel Classification

One of the most challenging topics is the recognition of Chinese handwriting, especially offline recognition. In this paper, an offline recognition system based on multifeature and multilevel classification is presented for handwritten Chinese characters. Ten classes of multifeatures, such as peripheral shape features, stroke density features, and stroke direction features, are used in this sys...


Polygonal Clustering Analysis Using Multilevel Graph-Partition

Existing methods of spatial data clustering have focused on point data, whose similarity can be easily defined. Because polygons have complex shapes and alignments, defining the similarity between non-overlapping polygons is important for clustering them. This study attempts to present an efficient method to discover clustering patterns of polygons by incorporating spatial cognition principles and mult...




Publication date: 2015